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The Lomb-Scargle periodogram is a common tool in the frequency analysis of unequally spaced data equivalent to least-squares fitting 
of sine waves. We give an analytic solution for the generalisation to a full sine wave fit, including an offset and weights fitting). 
Compared to the Lomb-Scargle periodogram, the generalisation is superior as it provides more accurate frequencies, is less susceptible 
to aliasing, and gives a much better determination of the spectral intensity. Only a few modifications are required for the computation 
and the computational effort is similar. Our approach brings together several related methods that can be found in the literature, viz. 
the date-compensated discrete Fourier transform, the floating-mean periodogram, and the "spectral significance" estimator used in the 
SigSpec program, for which we point out some equivalences. Furthermore, we present an algorithm that implements this generalisation 
for the evaluation of the Keplerian periodogram that searches for the period of the best-fitting Keplerian orbit to radial velocity data. 
The systematic and non-random algorithm is capable of detecting eccentric orbits, which is demonstrated by two examples and can 
be a useful tool in searches for the orbital periods of exoplanets. 
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1. Introduction 

The Lomb-Scargle periodogram (Scargle 1982) is a widely used 
tool in period searches and frequency analysis of time series. It is 
equivalent to fitting sine waves of the formy = a cos cot+b sin a>t. 
While standard fitting procedures require the solution of a set of 
linear equations for each sampled frequency, the Lomb-Scargle 
method provides an analytic solution and is therefore both con- 
venient to use and efficient. The equation for the periodogram 
was given by Earning (1963), and also Lomb (1976) and Scargle 
(1982), who furthermore investigated its statistical behaviour, 
especially the statistical significance of the detection of a signal. 
For a time series (ti, y,) with zero mean (y = 0), the Lomb- 
Scargle periodogram is defined as (normalisation from Lomb 
1976): 
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where the hats are used in this paper to symbolise the classical 
expressions. The parameter r is calculated via 
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However, there are two shortcomings. First, the Lomb- 
Scargle periodogram does not take the measurement eiTors into 
account. This was solved by introducing weighted sums by 
GilHland & BaHunas (1987) and Irwin et al. (1989) (equiva- 
lent to the generalisation to a AO- Second, for the analysis 



the mean of the data was subtracted, which assumes that the 
mean of the data and the mean of the fitted sine function are the 
same. One can overcome this assumption with the introduction 
of an offset c, resulting in a further generalisation of this peri- 
odogram to the equivalent of weighted full sine wave fitting; i.e., 
y = a cos (Lit + bsinajt + c. Gumming et al. (1999), who called 
this generalisation "floating-mean periodogram", argue that this 
approach is superior: "... the Lomb-Scargle periodogram fails to 
account for statistical fluctuations in the mean of a sampled si- 
nusoid, making it non-robust when the number of observations 
is small, the sampling is uneven, or for periods comparable to 
or greater than the duration of the observations." These authors 
provided a formal definition and also a sophisticated statistical 
treatment, but do not use an analytical solution for the computa- 
tion of this periodogram. 

Basically, analytical formulae for a full sine, least-squares 
spectrum have already been given by FeiTaz-Mello (1981), call- 
ing this date-compensated discrete Fourier transform (DGDFT). 
We prefer to adopt a notation closely related to the Lomb- 
Scargle periodogram calling it the generalised Lomb-Scargle pe- 
riodogram (GLS). Shrager (2001) tries for such an approach but 
did not generalise the parameter f in Eq. (3). Moreover, our gen- 
eralised equations, which are derived in the following (Sect. 2), 
have a comparable symmetry to the classical ones and also allow 
us to point out equivalences to the "spectral significance" estima- 
tor used in the SigSpec program by Reegen (2007) (Sect. 4). 



2. The generalised Lomb-Scargle periodogram 
(GLS) 

The analytic solution for the generalised Lomb-Scargle peri- 
odogram can be obtained in a straightforward manner in the 



2 



M. Zechmeister and M. Kurster: The generalised Lomb-Scargle periodogram 



same way as outlined in Lomb (1976). Let be the mea- 
surements of a time series at time and with errors cr,. Fitting a 
full sine function (i.e. including an offset c): 

yit) = a cos oj? + sin oj? + c 

at given frequency a> (or period P = ^) means to minimise the 
squared difference between the data yi and the model fimction 
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are the normaUsed weights ^ Minimisation leads to a system of 
(three) linear equations whose solution is derived in detail in 
Appendix A. 1 . Furthermore, it is shown in A. 1 that the relative 
-reduction p(w) as a function of frequency oj and normalised 
to unity hy (the;^f^ for the weighted mean) can be written as: 
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with: 

D{oS) = CC-SS -CS'^ 
and the following abbreviations for the sums: 
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Note that sums with hats correspond to the classical sums. 
W ■ YY = Xq is simply the weighted sum of squared devia- 
tions from the weighted mean. The mixed sums can also be writ- 
ten as a weighted covariance CoVx,y = X - X ■ Y/W = 
E(x ■ y) - WE(x)E(y) where E is the expectation value, e.g. 

YS = Co V,,,sin«(- 

With the weighted mean given by y = 2 = Y Eqs. (10)- 
(12) can also be written as: 



YY =2'"''<^'-30' 
YCico)^ ^ Wiiyt - y) cos coti 
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(16) 
(17) 
(18) 



' For clarity the bounds of the summation arc suppressed in the fol- 
lowing notation. They are always the same (i = 1,2,..., A'). 



So the sums YC and YS use the weighted mean subtracted data 
and are calculated in the same way as for the Lomb-Scargle pe- 
riodogram (but with weights). 

The generalised Lomb-Scargle periodogram p(cl>) in Eq. (4) 
is normalised to unity and therefore in the range of < p < 1, 
with p - indicating no improvement of the fit and p = I a 
"perfect" fit (100% reduction of;^^^ oix^ = 0). 

As the full sine fit is time-translation invariant, there is 
also the possibility to introduce an arbitrary time reference 
point T {ti — > ti - t; now, e.g. CC = 2 cos^ a>(f, - r) - 
(2 Wi cos co{ti - t))^), which will not affect the x^ of the fit. If 
this parameter r is chosen as 
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2 Wi cos 2a)ti - [(2 Wi cos cot,)^ - (2 vv, sin c<jf,)^j 

the interaction term in Eq. (5) disappears, CSj = X Wi cos Cjj{ti - 
r) sin 0}{ti - r) - 2 w,- cos co{ti - r) 2 sin a>(?,- - r) = (proof 
in Appendix A. 2) and in this case we append the index t to the 
time dependent sums. The parameter t{co) is determined by the 
times ti and the measurement errors cr, for each frequency w. So 
when using r as defined in Eq. (19) the periodogram in Eq. (5) 
becomes 
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Note that Eq. (20) has the same form as the Lomb-Scargle pe- 
riodogram in Eq. (1) with the difference that the errors can be 
weighted (weights w, in all sums) and that there is an additional 
second term in CCr, SSj, CSj and tan2wT (Eqs. (13)-(15) and 
Eq. (19), respectively) which accounts for the floating mean. 

The computational effort is similar as for the Lomb-Scargle 
periodogram. The incorporation of the offset c requires only two 
additional sums for each frequency co (namely S - J] Wi sin coti 
and C = cos coti or Sr and C^ respectively). The effort is 
even weaker when using Eq. (5) with keeping CS instead of us- 
ing Eq. (20) with the parameter r introduced via Eq. (19) which 
needs an extra preceding loop in the algorithm. If the errors are 
taken into account as weights, also the multiplication with w, 
must be done. 

For fast computation of the trigonometric sums the algorithm 
of Press & Rybicki (1989) can be applied, which has advan- 
tages in the case of large data sets and/or many frequency steps. 
Another possibility are trigonometric recurrences^ as described 
in Press et al. (1992). Note also that the first sum in SS can be 
expressed by SS = I- CC. 



3. Normalisation and False-Alarm probability (FAP) 

There were several discussions in the literature on how to nor- 
malise the periodogram. For the detailed discussion we refer to 
the key papers by Scargle (1982), Horne & Baliunas (1986), 
Koen (1990) and Gumming et al. (1999). The normalisation be- 
comes important for estimations of the false-alarm probabiUty 
of a signal by means of an analytic expression. Lomb (1976) 

showed that if data are Gaussian noise, the terms YC /CC and 

YS /SS in Eq. (1) are -distributed and therefore the sum of 
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where Aoj is the frequency step. 
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both (which is oc p) is ;^'^-distributed with two degrees of free- 
dom. This holds for the generalisation in Eq. (20) and also be- 
comes clear from the definition of the periodogram in Eq. (4) 

piaf) = -2-^ — where for Gaussian noise the difference in the 

numerator;(fQ-;(f^(w) is;tr^-distributedwith V = {N-l)-iN-3) = 
2 degrees of freedom. 

The pioS) can be compared with a known noise level p„ (ex- 
pected from the a priori known noise variance or population 
variance) and the normalisation of p(aji) to p„ 



Pn = 



P(^) 
Pn 



(21) 



can be considered as a signal to noise ratio (Scargle 1982). 
However, this noise level is often not known. 

Alternatively, the noise level may be estimated for Gaussian 
noise from Eq. (4) to be p„ = which leads to: 



(22) 



and is the analogon to the normalisation in Home & Baliunas 
(1986)^. So if the data are noise, P = 1 is the expected value. 
If the data contains a signal, P » 1 is expected at the signal 
frequency. However, this power is restricted to < P < 

But if the data contains a signal or if errors are under- or 
overestimated or if intrinsic variability is present, then p„ = 
may not be a good uncorrelated estimator for the noise level. 
Gumming et al. (1999) suggested to estimate the noise level a 
posteriori with the residuals of the best fit and normalised the 
periodogram as 
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where the index "best" denotes the corresponding values of the 
best fit (pbest = P(t^est))- 

Statistical fluctuations or a present signal may lead to a larger 
periodogram power. To test for the significance of such a peak in 
the periodogram the probability is assessed that this power can 
arise purely from noise. Gumming et al. (1999) clarified that the 
different normalisations result in different probability functions 
which are sunmiarised in Table 1. Note that the last two proba- 
bility values are the same for the best fit {zo = Zbest): 



Prob(z > Zbest) = 1 + 



2Zbest 

N-3 
1 



-(N-3)/2 
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1 - Pbest 
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, 1 - Pbest I 
= Prob(/5 > /5best) = Prob(P > Pbest)- 

Furthermore Baluev (2008) pointed out that the power definition 
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as a nonlinear (logarithmic) scale for;^^ has an exponential dis- 
tribution (similiar to P„) 



Prob(Z>Zbest) = e-^ = ^^ 



2i, A\(A'-3)/2 



: Prob(p > Pbest). 



Table 1. Probabilities that a periodogram power (P„, p, P or z) can 
exceed a given value (/'„,o, Po> Po or zo) for different normalizations 
(from Cumming et al. 1999). 



Reference level 


Range 


Probability 




population variance 
sample variance 


P„6 [0,cx,) 
P 6 [0, 1] 


Prob(P„ > P,,o) 
Prob(p > po) = 


= exp(-P„,o) 
(1-/^0)- 




P e [0, ^] 


Prob(P > Po) = 




residual variance 


Z 6 [0, cx>) 


Prob(z > Zo) = 1 


1 + 



Since in period search with the periodogram we study a 
range of frequencies, we are also interested in the significance 
of one peak compared to the peaks at other frequencies rather 
than the significance of a single frequency. The false alarm prob- 
ability (FAP) for the period search in a frequency range is given 
by 

FAP = 1 - [1 - Prob(z > zo)]^ (24) 

where M is the number of independent frequencies and may be 
estimated as the number of peaks in the periodogram. The width 
of one peak is (5/ » j (^frequency resolution). So in the fre- 
quency range A/ - fi - fi there are approximately M = ^ 
peaks and in the case /i <c fx one can write M ^ Tfz (Gumming 
2004). FinaUy, for low FAP values the following handy approx- 
imation for Eq. (24) is valid: 



FAP « M • Prob(z > zo) for FAP <«cl. 



(25) 



Another possibility to access M and the FAP are Monte 
Garlo or bootstrap simulations in order to determine how often a 
certain power level (zo) is exceeded just by chance. Such numer- 
ical calculation of the FAP are much more time-consuming than 
the actual computation of the GLS. 

4. Equivalences between the GLS and SigSpec 
(Reegen 2007) 

Reegen (2007) developed a method, caUed SigSpec, to deter- 
mine the significance of a peak at a given frequency (spectral sig- 
nificance) in a discrete Fourier transformation (DFT) which in- 
cludes a zero mean correction. We will recapitulate some points 
from his paper in an adapted and shortened way in order to show 
several equivalences and to disentangle different notations used 
by us and used by Reegen (2007). For a detailed description we 
refer to the original paper. Briefly, approaching from Fourier the- 
ory Reegen (2007) defined the zero mean corrected Fourier co- 
efficients* 



azMioj) 



^zm(w) 



^^y,cosa;f,-^^y,-2cos 
i^y,sina>?,-ij^y,-;^ 



OJti 



sm OJti 



which obviously correspond to YC and YS in Eqs. (11) and (12). 
Their variances are given by 

{4m) = ^ [Z " ^ (Z "^^'f 



' These authors called it the normalization with the sample variance 
CTp. Note that p(cl)) is already normalized with;^'^. For the unweighted 
case (Wi = jj, cr^ = j^YY) one can write Eq. (22) with Eq. (20) as 

nw) - [cc + IT]- 

Here only the unweighted case is discussed (w; = ^). Reegen 
(2007) also gives a generalization to weighting. 
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sin^ ojti 



sin ojt: 







The precise value of these variances depends on the temporal 
sampling. These variances can be expressed as (al^j) = 

Consider now two independent Gaussian variables whose 
cumulative distribution function (CDF) is given by: 



A Gaussian distribution of the physical variable >', in the time 
domain will lead to Gaussian variables azu and bzM which in 
general are correlated. A rotation of Fourier Space by phase Oq 



tan 200 = 



A' 2 sin 2a>ti - 2 J] cos wf; 2 sin wf; 



N'Z, cos Icoti - (2 cos a)ti)^ + sin a>r,)^ 

transforms azM and bzM into uncorrelated coefficients a and j8 
with vanishing co variance. Indeed, car from Eq. (19) and 6q have 
the same value, but t is applied in the time domain, while Oo is 
applied to the phase 6 in the Fourier domain. It is only mentioned 
here that the resulting coefficients 2a and 2j8 correspond to YCt 
and 75 T. 

Finally, Reegen (2007) defines as the spectral significance 
sig(Q;,jS|w) := -log(^(a,p\cij) and writes: 



sig(azM, bzMlo)) 



N\oge 



azM cos 6>o + bzM sin Op 
ao 

azM cos 00 - bzM sin 0o 
Jo 



(26) 



where 



ao := a 2N 



= y ^ X cos2(wfi - 6*0) - cos(wf; - 6»o)]' 



= ^-^[^Yj - ^o) - \Yj sinC'^fi - ^'o)]' 

are called normaUsed semi-major and semi-minor axes. Note 
ffiat al ~ ICCr and 0^ -288-,. 

Reegen (2007) states that this gives as accurate frequencies 
as do least squares. However, from this derivation it is not clear 
ffiat ffiis is equivalent. But when comparing Eq. (26) to Eq. (20) 
with using FC^ = YC cos ojt+YS sin ojt and YS-^ = YS cos ojt- 
YC sin COT. 



pUiS) - — 
^ YY 



(YC cos MT + YS sin lot) 



CCr 

(YS cosojT-YCsmojT)^ 



SSr 

ffie equivalence of the GLS and ffie spectral significance estima- 
tor in SigSpec (and with it to least squares) is evident: 



sig(azM, bzM\co) = 



YY-Nloge 
2</) 



• p(<o). 



The two indicators differ only in a normalisation factor, which 

becomes loge, when (y^"^ is estimated with the sample vari- 
ance ^y^^ - j^YY. Therefore, SigSpec gives accurate fre- 
quencies just like least squares. But note ffiat the Fourier am- 
plitude is given by the sum of the squared Fourier coefficient 
A2 = fl|^ + bl^ = 4q'2 + 4/32 ~ FC^ + = YCl + YSl while 

999 YC~ YS^ 

the least-squares fitting amplitude is A - a + b - + -j^ 
(see A.l). 

The comparison with SigSpec ofi'ers anoffier point of view. 
It shows how the GLS is associated to Fourier ffieory and how it 
can be derived from the DFT (discrete Fourier transform) when 
demanding certain statistical properties such as simple statistical 
behaviour, time-translation invariance (Scargle 1982) and vary- 
ing zero mean. The shown equivalences allow vice versa to apply 
several of Reegen's (2007) conclusions to the GLS, e.g. that it is 
less susceptible to aliasing or that the time domain sampUng is 
taken into account in ffie probability distribution. 

5. Application of tlie GLS to tlie Keplerian 
periodogram (Keplerian fits to radial velocity 
data) 

The search for the best-fitting sine function is a multidimen- 
sional ;t^^-minimisation problem wiffi four parameters: period P, 
amplitude A, phase (f and offset c (or frequency co = 27r/P, a, 
b and c). At a given frequency cj ffie best-fitting parameters A, 
<p and c can be computed inmiediately by an analytic solution 
revealing the global optimum for this three dimensional param- 
eter subspace. But involving the frequency leads to a lot of local 
optima (minima in x^) as visualised by the numbers of max- 
ima in ffie generaUsed Lomb-Scargle periodogram. With step- 
ping through frequency co one can pick up ffie global optimum in 
the four dimensional parameter space. That is how period search 
with the periodogram works. Because an analytic solution (im- 
plemented in the GLS) can be employed partially, there is no 
need for stepping through all parameters to explore ffie whole 
parameter space for ffie global optimum. 

This concept can be transferred to the Keplerian peri- 
odogram which can be applied to search stellar radial veloc- 
ity data for periodic signals caused by orbiting companions and 
measured from spectroscopic Doppler shifts. The radial velocity 
curve becomes more non-sinusoidal for a more eccentric orbit 
and its shape depends on six orbital elements^: 

y constant system radial velocity 

K radial velocity amplitude 

vy longitude of periastron 

e eccentricity 

To periastron passage 

P period. 

In comparison to ffie full sine fit ffiere are two more param- 
eters to deal with which compUcates ffie period search. An ap- 
proach to simplify the Keplerian orbit search is to use the GLS 
to look for a periodic signal and use it for an initial guess. But 
choosing ffie best-fitting sine period does not necessarily lead to 
ffie best-fitting Keplerian orbit. So for finding the global opti- 
mum the whole parameter space must be explored. 



K = ^ 



the inclination. 



y -^=^ with a the semi-major axis of the stellar orbit and (' 
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The Keplerian periodogram (Cumming 2004), just like the 
GLS, shows how good a trial period (frequency (o) can model 
the observed radial velocity data and can be defined as (x^- 
reduction): 



Instead of the sine function, the function 

RV(t) - y + K[e cos m + cos(v(f) + m)] 



(27) 



which describes the radial reflex motion of a star due to the 
gravitational pull of a planet, serves as the model for the ra- 
dial velocity curve. The time dependence is given by the true 
anomaly v which furthermore depends on three orbital parame- 
ters {vit, P, e, Tq)). The relation between v and time t is: 



V 

tan - = 
2 



1 



e E 
— tan — 
e 2 



E-e&mE = M = 2n- 



To 



(28) 



where E and M are called eccentric anomaly and mean anomaly, 
respectively^. Eq. (28), called Kepler's equation, is transcendent 
meaning that for a given time / the eccentric anomaly E cannot 
be calculated expUcitly. 

For the computation of a Keplerian periodogram is to be 
minimised with respect to five parameters at a fixed trial fre- 
quency o). Similar to the GLS there is no need for stepping 
through all parameters. With the substitutions c = y + Ke cos xjj, 
a = K cos w and b = -K sin m Eq. (27) can be written as: 

RVit) = c + acos vit) + b sin vit) 

and with respect to the parameters a, b and c the analytic solution 
can be employed as described in Sect. 2 for known v (instead 
of ojt) . So for fixed P, e and Jq the true anomalies v, can be 
calculated and the GLS from Eq. (5) (now using v, instead of cjti) 
can be applied to compute the maximum -reduction ipioS)), 
here called pejf^ioj). Stepping through e and Tq yields the best 
Keplerian fit at frequency co: 

PKspio)) = maxpej^io)) 

as visualised in the Keplerian periodogram. Finally, with step- 
ping through the frequency, like for the GLS, one will find the 
best-fitting Keplerian orbit having the overall maximum power: 

/'Kep(^^est) = maxpKep(^j) 

0) 

There exist a series of tools (or are under development) us- 
ing genetic algorithms or Bayesian techniques for fast searches 
for the best Keplerian fit (Ford & Gregory 2007; Balan & Lahav 
2008). The algorithm, presented in this section, is not further 
optimised for speed. But it works well, is easy to implement 
and is robust since in principle it cannot miss a peak if the 3 
dimensional grid for e, Jq and is sufficiently dense. A reU- 
able algorithm is needed for the computation of the Keplerian 
periodogram which by definition yields the best fit at fixed fre- 
quency and no local -minima. O'Toole et al. (2007, 2009) de- 
veloped an algorithm called 2DKLS (two dimensional Kepler 
Lomb-Scargle) that works on grid of period and eccentricity and 



The following expressions are also used frequently: sinv = 



seems to be similar to ours. But the possibility to use partly an 
analytic solution or the need for stepping is not mentioned by 
these authors. 

The effort to compute the Keplerian periodogram with the 
described algorithm is much stronger in comparison to the GLS. 
There are three additional loops: two loops for stepping through 
e and Jq and one for the iteration to solve Kepler's equation. 
Contrary to the GLS it is not possible to use recurrences or the 
fast computation of the trigonometric sums mentioned in Sect. 2. 

However we would like to outhne some possibihties for tech- 
nical improvements for a faster search. The first concerns the 
grid size. We choose a regular grid for e, To and w. While this is 
adequate for the frequency w, as we discuss later in this sec- 
tion, there might be a more appropriate e-To grid, e.g. a less 
dense grid size for Jq at lower eccentricities. A second possi- 
bility is to reduce the iterations for solving Kepler's equation by 
using the eccentric anomalies (or differentially corrected ones) 
as initial values for the next slightly different grid point. This 
can save several ten percent of computation time, in particular in 
dense grids and at high eccentricities. A third idea which might 
have a high potential for speed up is to combine our algorithm 
with other optimisation techniques (e.g. Levenberg-Marquardt 
instead of pure stepping) to optimise the remaining parameters 
in the function pejj^oS). A raw grid would provide the initial val- 
ues and the optimisation technique would do the fine adjustment. 

To give an example Fig. 1 shows RV data for the M 
dwarf GJ 1046 (Kurster et al. 2008) along with the best-fitting 
Keplerian orbit iP = 168.8 d, e = 0.28). Figure 2 shows the 
Lomb-Scargle, GLS and Keplerian periodograms. Because a 
Keplerian orbit has more degrees of freedom it always has the 
highest reduction (0 < /?ls < /^gls < /5Kep,e<o.6 < /'Kep < 1). 

As a comparison the Keplerian periodogram restricted to 
e < 0.6 is also shown in Fig. 2. At intervals where pxep ex- 
ceeds ;»e<o.6 the contribution is due to very eccentric orbits. Note 
that the Keplerian periodogram obtains more structure when the 
search is extended to more eccentric orbits. Therefore the evalu- 
ation of the Keplerian periodogram needs a higher frequency res- 
olution (this effect can also be observed in O'Toole et al. 2009). 
This is a consequence of the fact that more eccentric orbits are 
spikier and thus more sensitive to phase and frequency. 

Other than O'Toole et al. (2009) we plot the periodograms 
against frequency^ to illustrate that the typical peak width 8f is 
frequency independent. Thus equidistant frequency steps (d/ < 
6f) yield a uniform sampling of each peak and are the most eco- 
nomic way to compute the periodogram rather than e.g. loga- 
rithmic period steps (leading to oversamphng at long periods: 
d/ = di = - ;JidP = - ;^d hi P) as used by O'Toole et al. (2009). 
Still the periodograms can be plotted against a logarithmic pe- 
riod scale as e.g. sometimes preferred to present a period search 
for exoplanets. 

Figure 4 visualises local optima in the p^j^ map at an arbi- 
trary fixed frequency. There are two obvious local optima which 
means that searching from only one initial value for To may be 
not sufficient as one could fail to lead the best local optimum in 
the e-To plane. This justifies a stepping through e and To. The 
complexity in the c-Tq plane, in particular at high eccentricities, 
finally translates into the Keplerian periodogram. When vary- 
ing the frequency the landscape and maxima will evolve and the 
maxima can also switch. 

In the given example LS and GLS would give a good initial 
guess for the best Keplerian period with only a sUght frequency 
shift. But this is not always the case. 



V'' "'"Z andcosv= f^^. 



for Fourier transforms this is common 
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Fig. 1. The radial velocity (RV) time series of the M dwarf GJ 1046. 
The solid line is the best Keplerian orbit tit (P = 168.8 d, e = 0.28). 
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Fig. 2. Comparison of the normalized Lomb-Scargle (LS), GLS and 
Keplerian periodograms for GJ 1046 (/ = = ^). 
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Fig. 3. The same periodograms as in Fig. 2 with a quasi-logarithmic 
scale for p and a logarithmic scale for^^ (axis to the right). 

One may argue, that the second peak has an equal height sug- 
gesting the same significance. On a linear scale it seems so. But 
the significance is not a linear function of the power. Gumming 
et al. (2008) normalised the Keplerian periodogram as 



ZKep(w) 



N-5xl-xH(^) N-5 Pk,v(co) 



analogous to Eq. (23) and derived the probability distribution 



(29) 



Prob(z > zo) = 1 + — r 111 + 



2 N-5 



N-5 



With this we can calculate that the higher peak which is much 
closer to 1 has a lO"'** times lower probability to be due to noise, 




90 180 270 

Periastronpassage Tg [deg] 

Fig. 4. Power map (pe.T„) for e and Tq at the arbitrary fixed frequency 
/ = 0.00422 d"'. The maximum value p = 0.592 is deposited in the 
Keplerian periodogram. Note that there are the two local optima. 

i.e. it has a lO''* times higher significance. In Fig. 3 the peri- 
odogram power is plotted on a logarithmic scale for on the 
right-hand side. The much lower is another convincing argu- 
ment for the much higher significance. 

Gumming (2004) suggested to estimate the FAP for the pe- 
riod search analogous to Eq. (25) and the number of independent 
frequencies again as M ^ TAf. This estimation does not take 
into account the higher variability in the Keplerian periodogram, 
which depends on the examined eccentricity range, and therefore 
this FAP is likely to be underestimated. 

Another, more extreme example is the planet around 
HD 20782 discovered by Jones et al. (2006). Figure 5 shows 
the RV data for the star taken from O'Toole et al. (2009). Due 
to the high eccentricity this is a case where LS and GLS fail to 
find the right period. However, our algorithm for the Keplerian 
periodogram find the same solution as the 2DKLS (P = 59L9 d, 
e - 0.97). The Keplerian periodogram in Fig. 6 indicates this pe- 
riod. This time it is normalised according to Eq. (29) and seems 
to suffer from an overall high noise level (caused by many other 
eccentric solutions that will fit the one 'outlier'). However, that 
the period is significant can again be shown just as in the previ- 
ous example. 

For comparison we also show the periodogram with the nor- 
malisation by the best fit at each frequency (Gumming 2004) 



N-5xl-x\(^) 



4 



xHco) 



(30) 



which is used in the 2DKLS and reveals the power maximum as 
impressively as in O'Toole et al. (2009, Fig. lb). As Gumming 
et al. (1999) mentioned the choice of the normalisation is a mat- 
ter of taste; the distribution of maximum power is the same. 
Finally, keep in mind when comparing Fig. 6 with the 2DKLS 
in O'Toole et al. (2009, Fig. lb) which shows a slice at e = 0.97, 
that the Keplerian periodogram in Fig. 6 includes all eccentrici- 
ties (0 < e < 0.99). Also the algorithms are different because we 
also step for Tq and simultaneously fit the longitude of periastron 

VJ. 

6. Conclusions 

Generalised Lomb-Scargle periodogram (GLS), floating-mean 
periodogram (Gumming et al. 1999), date-compensated discrete 
Fourier transform (DGDFT, Ferraz-Mello 1981), and "spectral 
significance" (SigSpec, Reegen 2007) at last all mean the same 
thing: least-squares spectrum for fitting a sinusoid plus a con- 
stant. Gumming et al. (1999) and Reegen (2007) have already 
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Fig. 5. The radial velocity (RV) time series of HD 20782. The solid line 
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Fig. 6. Keplerian periodogram for HD 20782. Both are the same 
Keplerian periodogram, but the upper one is normalized with the best 
fit (Eq. (29)) while the lower one is normalized with the best fit at each 
frequency (Eq. (30)). Both have by definition the same maximum value. 



shown the advantages of accounting for a varying zero point and 
therefore we recommend the usage of the generalised Lomb- 
Scargle periodogram (GLS) for the period analysis of time se- 
ries. The implementation is easy as there are only a few modifi- 
cations in the sums of the Lomb-Scargle periodogram. 

The GLS can be calculated as conveniently as the Lomb- 
Scargle periodogram and in a straight forward manner with an 
analytical solution while programs applying standard routines 
for fitting sinusoids involve solving a set of linear equations by 
inverting a 3 x 3 matrix repeated at each frequency. The GLS 
can be tailored by concentrating the sums in one loop over the 
data. As already mentioned by Lomb (1976) Eq. (5) (including 
Eqs. (13)-(18)) should be applied for the numerical work. A fast 
calculation of the GLS is especially desirable for large samples, 
large data sets and/or many frequency steps. It also may be help- 
ful to speed up prewhitening procedures (e.g. Reegen 2007) in 
case of multifrequency analysis or numerical calculations of the 
significance of a signal with bootstrap methods or Monte Carlo 
simulations. 

The term generalised Lomb-Scargle periodogram has al- 
ready been used by Bretthorst (2001) for the generalisation to 
sinusoidal functions of the kind: y{t) - aZ(t) cos a)t+bZ(t) sin ojt 
with an arbitrary amplitude modulation Z(f) whose time depen- 
dence and all parameters are fully specified (e.g. Z(f) can be 
an exponential decay). Without repeating the whole procedure 
given in A.l, A. 2 and Sect. 2 it is just mentioned here that 
the generalisation to y(t) - aZ{t) cos lot + bZ{t)sinojt + c will 
result in the same equations with the difference that Z(f,) has 



to be attached to each sine and cosine term in each sum (e.g. 
CC - 2 WiZ{tj) cos (L)ti ■ Z(ti) cos (jL>tj). 

We presented an algorithm for the application of GLS to the 
Keplerian periodogram which is the least-squares spectrum for 
Keplerian orbits. It is an hybrid algorithm that partially applies 
an analytic solution for linearised parameters and partially steps 
through non-linear parameters. This has to be distinguished from 
methods that use the best sine fit as an initial guess. With two 
examples we have demonstrated that our algorithm for the com- 
putation of the Keplerian periodogram is capable to detect (very) 
eccentric planets in a systematic and nonrandom way. 

Apart from this, the least-squares spectrum analysis (the idea 
goes back to Vanicek 1971) with more complicated model func- 
tions than full sine functions is beyond the scope of this paper 
(e.g. including linear trends. Walker et al. 1995 or multiple sine 
functions). For the calculation of such periodograms the employ- 
ment of the analytical solutions is not essential, but can be faster. 

Appendix A: Appendix 

A.1. Derivation of tiie generalised Lomb-Scargle 
periodogram (GLS) 

The derivation of the generalised Lomb-Scargle periodogram is 
briefly shown. With the sinusoid plus constant model 

y{t) = a cos (jjt + b sin (jjt + c 

the squared difference between the data yi and the model func- 
tion y{t) 

^wY^Wi[yi-y(ti)f 

is to be minimised. For the minimum the partial derivatives 
vanish and therefore: 



= daX^ = 2 Wilyi - ym cos coU (A. 1) 

= dhX^ = 2W ^ Wi[yi - ym sin wf; (A.2) 
= 5,/ = 2W ^ - ym (A.3) 
These conditions for the minimum give three linear equations: 



YC 




CC CS C 




a 


YS 




cs s^s s 




b 


Y 




C S 1 




c 



where the abbreviations in Eqs. (7)-(15) were applied. 
Eliminating c in the first two equations with the last equation 
(c = y-flC-Z75) yields: 



YC-YC 
YS - Y S 



CC-C C CS - C S 
CS - C S s^s - s s 



Using again the notations of Eqs. (11)-(15) this can be written 
as: 



YC 




CC 


CS 




a 


YS 




CS 


ss 




b 



So the solution for the parameters a and b is 

YC SS -YS CS , , YS CC-YC CS 
a = and b 



D 



D 



(A.4) 



The amplitude of the best-fitting sine function at frequency w is 
given by Va^ -i- b^. With these solutions the minimum can be 
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written only in terms of the sums Eqs. (10)-(15) when eliminat- 
ing the parameters a, b and c as shown below. With the condi- 
tions for the minimum Eqs. (A.1)-(A.3) it can be seen that: 

^ Wi\yi - y{ti)]y{ti) = a ^ vv, [>'; - yiti)] cos (oU 

+ Wi[yi - y{ti)] sin coU 

+ c^Wi\yi - y{ti)\ 
= 0. 

Therefore, the minimum can be written as: 

x\oS)lw = Yj wibi - yitiM - Yj ^'tF' - y^^d^yiti) 

=0 

= YY - aYC - bY''S - cY 

= YY-Y-Y-aiYC-Y-C)-biY''S -Y-S) 

= YY- aYC - bYS 

where in the last step again the definitions of Eqs. (10)-(12) were 
applied. Finally a and b can be substituted by Eq. (A.4): 



D 



D 



D 



When now using the -reduction normalised to unity: 



pioj) = 



Xo 
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and the fact that^o = W -YY, Eq. (5) will result. 



A.2. Verification of Eq. (19) 

Eq. (19) can be verified with the help of trigonometric addition 
theorems. For this purpose CS must be formulated. Furthermore, 
the index r and the notation (p = ojt will be used: 

2CS r =Yi^' ~ 

— 2 ^ Wi cos(wf; - (f) ^ Wi sin(wf,- - ip) 

- cos 2(p ^ Wi sin 2a)ti - sin 2(f ^ w,- cos 2a)ti 

- 2(C cos (fi + S sin (p){S cos ^ - C sin ip) 
Expanding the last term yields 

2CS T -2CS cos2(p - {CC - S S)sm2(p 
- 2 [C ■ 5 (cos^ (fi - sin^ <p) - (C^ 

and after factoring cos 2^ and sin 2^: 

2CS r ^2(CS - C ■ S)cos2(p- {cC - 55 ■ 
=2CS cos 2(/j - (CC - 5 5 ) sin 2^ 
So for CSj = 0, (fi - COT has to be chosen as 

2C5 



S ^) sin (fi cos 95J 
(C^- 5 2)) sin 2^ 



tan2(yr : 



cc-ss 



By the way, replacing the generalised sums CC, SS and CS by 
the classical ones leads to the original definition for f in Eq. (3): 



tan 2a)f = 



2CS 



2 sin 2LL>ti 
CC-S^S ~ Xcos2ojti' 



